Statistical Models for Frame-Semantic Parsing
نویسنده
چکیده
We present a brief history and overview of statistical methods in frame-semantic parsing – the automatic analysis of text using the theory of frame semantics. We discuss how the FrameNet lexicon and frameannotated datasets have been used by statistical NLP researchers to build usable, state-of-the-art systems. We also focus on future directions in frame-semantic parsing research, and discuss NLP applications that could benefit from this line of work. 1 Frame-Semantic Parsing Frame-semantic parsing has been considered as the task of automatically finding semantically salient targets in text, disambiguating their semantic frame representing an event and scenario in discourse, and annotating arguments consisting of words or phrases in text with various frame elements (or roles). The FrameNet lexicon (Baker et al., 1998), an ontology inspired by the theory of frame semantics (Fillmore, 1982), serves as a repository of semantic frames and their roles. Figure 1 depicts a sentence with three evoked frames for the targets “million”, “created” and “pushed” with FrameNet frames and roles. Automatic analysis of text using framesemantic structures can be traced back to the pioneering work of Gildea and Jurafsky (2002). Although their experimental setup relied on a primitive version of FrameNet and only made use of “exemplars” or example usages of semantic frames (containing one target per sentence) as opposed to a “corpus” of sentences, it resulted in a flurry of work in the area of automatic semantic role labeling (Màrquez et al., 2008). However, the focus of semantic role labeling (SRL) research has mostly been on PropBank (Palmer et al., 2005) conventions, where verbal targets could evoke a “sense” frame, which is not shared across targets, making the frame disambiguation setup different from the representation in FrameNet. Furthermore, it is fair to say that early research on PropBank focused primarily on argument structure prediction, and the interaction between frame and argument structure analysis has mostly been unaddressed (Màrquez et al., 2008). There are exceptions, where the verb frame has been taken into account during SRL (Meza-Ruiz and Riedel, 2009; Watanabe et al., 2010). Moreoever, the CoNLL 2008 and 2009 shared tasks also include the verb and noun frame identification task in their evaluations, although the overall goal was to predict semantic dependencies based on PropBank, and not full argument spans (Surdeanu et al., 2008; Hajič et al., 2009). The SemEval 2007 shared task (Baker et al., 2007) attempted to revisit the frame-semantic analysis task based on FrameNet. It introduced a larger FrameNet lexicon (version 1.3), and also a larger corpus with full-text annotations compared to prior work, with multiple targets annotated per sentence. The corpus allowed words and phrases with noun, verb, adjective, adverb, number, determiner, conjunction and preposition syntactic categories to serve as targets and evoke frames, unlike any other single dataset; it also allowed targets from different syntactic categories share frames, and therefore roles. The repository of semantic role types was also much richer than PropBankstyle lexicons, numbering in several hundreds. Most systems participating in the task resorted to a cascade of classifiers and rule-based modules: identifying targets (a non-trivial subtask), disambiguating frames, identifying potential arguments, and then labeling them with roles. The system described by Johansson and Nugues (2007) performed the best in this shared task. Next, we focus on its performance, and subsequent improvements made by the research community on this task.
منابع مشابه
Semi-Supervised and Latent-Variable Models of Natural Language Semantics
This thesis focuses on robust analysis of natural language semantics. A primary bottleneck for semantic processing of text lies in the scarcity of high-quality and large amounts of annotated data that provide complete information about the semantic structure of natural language expressions. In this dissertation, we study statistical models tailored to solve problems in computational semantics, ...
متن کاملAutomatic Labeling of Semantic Roles Qualifying Exam Proposal
The problem of linking syntactic constituents of a sentence to semantic roles is an essential part of many natural language processing tasks. The research outlined here aims to develop a statistical approach to the problem, extending the methodology that has been very successful in statistical parsing one step closer to language understanding. Both speci c and more abstract semantic roles are c...
متن کاملFrame-Semantic Parsing
Frame semantics is a linguistic theory that has been instantiated for English in the FrameNet lexicon. We solve the problem of frame-semantic parsing using a two-stage statistical model that takes lexical targets (i.e., content words and phrases) in their sentential contexts and predicts frame-semantic structures. Given a target in context, the first stage disambiguates it to a semantic frame. ...
متن کاملبرچسبزنی خودکار نقشهای معنایی در جملات فارسی به کمک درختهای وابستگی
Automatic identification of words with semantic roles (such as Agent, Patient, Source, etc.) in sentences and attaching correct semantic roles to them, may lead to improvement in many natural language processing tasks including information extraction, question answering, text summarization and machine translation. Semantic role labeling systems usually take advantage of syntactic parsing and th...
متن کاملEvent Extraction as Frame-Semantic Parsing
Based on the hypothesis that frame-semantic parsing and event extraction are structurally identical tasks, we retrain SEMAFOR, a stateof-the-art frame-semantic parsing system to predict event triggers and arguments. We describe how we change SEMAFOR to be better suited for the new task and show that it performs comparable to one of the best systems in event extraction. We also describe a bias i...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2014